Violations of Bell inequalities as lower bounds on the communication cost of non-local correlations

To reproduce in a local hidden variables theory correlations that violate Bell inequalities, communication
must occur between the parties. We show that the amount of violation of a Bell inequality
imposes a lower bound on the average communication needed to produce these correlations. Moreover,
for every probability distribution there exists an optimal inequality for which the degree of
violation gives the minimal average communication. As an example, to produce using classical
resources the correlations that maximally violate the CHSH inequality, $\sqrt{2} - 1 \approx 0.4142$ bits of
communication are necessary and sufficient. For Bell tests performed on two entangled states of
dimension $d \ge 3$ where each party has the choice between two measurements, our results suggest
that more communication is needed to simulate outcomes obtained from certain non-maximally
entangled states than maximally entangled ones.




I. INTRODUCTION 

Characterizing the features of quantum mechanics 
which differentiate it from classical theories is an impor- 
tant issue for quantum information theory, as well as from 
a fundamental perspective. One such peculiarity is the 
non-local character of quantum mechanics, i.e. the fact 
that quantum correlations are incompatible with local 
realistic theories. Apart from being one of the most in- 
triguing aspects of nature, non-locality is deeply related 
to several quantum information processing tasks,
and is at the core of quantum communication complexity.

It was Bell who first showed that correlations ob- 
tained by measuring two separated subsystems cannot be 
explained by a classical realistic theory if no communi- 
cation between the subsystems is allowed. The question 
which then follows is: how much communication is re- 
quired to reproduce these correlations ? This is a natu- 
ral way to quantify the non-local character of quantum 
correlations in terms of classical resources. We will show 
that the inequalities introduced by Bell forty years ago 
not only tell us that some communication is necessary to 
produce the correlations but also how much. 

The situation we consider is the one encountered in
bipartite Bell scenarios. Two spatially separated parties,
Alice and Bob, receive local inputs x and y and subsequently
produce outputs a and b. We denote by $M_A$ the number of
possible inputs on Alice's side and by $M_B$ the number of
inputs on Bob's side, and restrict ourselves to the case where
a finite number of distinct outcomes is associated with each
input. The scenario is completely characterized by the
probabilities $p_{ab|xy}$ that Alice outputs a when given x and
Bob outputs b when given y. We therefore associate to each
Bell scenario a correlation vector p with entries $p_{ab|xy}$.
Note that these entries satisfy the normalisation constraints

$$\sum_{a,b} p_{ab|xy} = 1 \quad \forall x, y. \tag{1}$$

In the quantum version of the Bell scenario, Alice and
Bob share an entangled quantum state on which they 
perform local measurements. The inputs x and y then 
correspond to the possible settings of their measuring 
apparatus and the outcomes a and b correspond to the 
results of these measurements. 

In the classical version of the Bell scenario, Alice and
Bob may use only classical resources, i.e. shared randomness
(local hidden variables) and classical communication, to
determine their outcomes a and b. If the two parties have
unrestricted access to shared randomness, the classical cost
of producing the correlations p is the minimum amount of
communication they must exchange in a classical protocol to
achieve this goal. Different measures of this amount of
communication are possible:

i) $C_w(p)$: Worst case communication: the maximal
amount of communication exchanged between Alice and
Bob in any particular execution of the protocol.

ii) $C(p)$: Average communication: the average communication
exchanged between Alice and Bob, where the average is taken
over the inputs and the shared randomness.

iii) $C_\infty(p)$: Asymptotic communication: the limit
$\lim_{n \to \infty} C(p^{\otimes n})/n$, where $p^{\otimes n}$ is the probability
distribution obtained when n runs of the Bell scenario are
carried out in parallel, that is when the parties receive n
inputs and produce n outputs in one go. See [13].

In each of these definitions the costs are defined with
respect to the optimal protocol that gives the lowest value
for each quantity.

The asymptotic measure may be the most appropriate
when one is concerned with practical applications
that make use of the correlations but is less preoccupied
with whether the measurements are performed individually or
collectively. On the other hand, the first two measures of
communication relate to protocols where the outcomes
are determined after each single pair of inputs is chosen.
This is in particular the situation encountered in
Bell tests. These two measures thus more properly count
the communication necessary to classically simulate
non-locality, and it could be expected that they are closely
connected to Bell inequalities. Relations between the
worst case situation and Bell inequalities were examined
in earlier work, where the authors introduced new Bell inequalities
that are satisfied by correlations that necessitate at most
1 bit of communication to be simulated.

In the present paper, we concentrate on the average
communication C. We first point out that the amount
by which the probabilities p violate a Bell inequality imposes
a lower bound on C(p). This bound is simply a
bound on the amount of communication needed to classically
simulate a violation of the inequality. It is a priori
unclear that one particular manifestation of the non-local
content of correlations, the violation of a specific Bell
inequality, suffices to characterize exactly the communication
C(p) necessary to reproduce the entire set of correlations
(all the less since in general correlations violate
more than one inequality). Yet, to each probability distribution
p is associated an optimal inequality such that
the bound the violation imposes on C(p) is saturated,
i.e. it gives the minimal average communication needed
to reproduce these correlations. We then investigate in
detail the case of the CHSH inequality [14]. We show
that for two-settings and two-outcomes Bell scenarios,
the CHSH inequality is optimal for all quantum correlations.
This implies in particular that $\sqrt{2} - 1 \approx 0.4142$
bits are necessary and sufficient on average to reproduce
classically the correlations that lead to the maximal violation
of the inequality. We then apply our approach to
the CGLMP inequality [15]. We find that for two-measurement
scenarios more communication is needed to
reproduce the effect of measuring certain non-maximally
entangled states of two qutrits than is necessary for maximally
entangled ones. Our results, combined with those
of [16], suggest that this is also the case for qudits with
d > 3. Finally we ask whether for quantum correlations
the optimal inequalities from the communication point of
view are always facet inequalities. We give an example
where this is not the case.

This paper is organized as follows. We first describe 
in section II how the average communication C relates 
to the degree of violation of Bell inequalities. We then 
apply these ideas to the CHSH inequality in section III 
and to the CGLMP inequality in section IV. In section 
V, we discuss the relations between optimal and facet 
inequalities. 



II. GENERAL FORMALISM 
A. Deterministic protocols 

To state our results it is necessary to consider particular
classical protocols, the deterministic ones, which do not
use any kind of randomness. These protocols therefore
always produce the same pair of outcomes for given inputs
x and y. The entries of the associated correlation
vector d are thus of the form
$d_{ab|xy} = \delta_{a,\alpha(x,y)}\,\delta_{b,\beta(x,y)}$, where
$\alpha(x,y)$ and $\beta(x,y)$ specify Alice's and Bob's outcomes for
measurements x and y. Since there are a finite number of
functions $\alpha(x,y)$ and $\beta(x,y)$, there are a finite number
of different deterministic strategies $d^{\lambda}$, which we index
by $\lambda$. Their interest is that any classical protocol can be
viewed as a probability distribution $\{q_\lambda\}$ over deterministic
protocols $d^{\lambda}$. That is, any correlation vector p can be
written as $p = \sum_\lambda q_\lambda d^{\lambda}$ where $q_\lambda \ge 0$ and $\sum_\lambda q_\lambda = 1$.

Deterministic protocols for which $\alpha$ and $\beta$ depend only
on the measurements performed locally by each party, i.e.
$\alpha = \alpha(x)$, $\beta = \beta(y)$, are local protocols. No communication
at all is required to implement them. On the other
hand, if $\alpha(x,y)$ or $\beta(x,y)$ depends on the input of the
other party, some (deterministic) communication $c(x,y)$
between the parties is necessary to carry out the protocol.

It will be convenient to group in subsets $\mathcal{D}_i$ the deterministic
strategies that need the same communication $c_i$ to
be implemented. Since in the present paper we are interested
in the average communication C, we will group
the deterministic strategies with respect to the minimal
average communication needed to implement them, expressed
in bits. Indexing strategies in $\mathcal{D}_i$ by $\lambda_i$, we
thus have $C(d^{\lambda_i}) = c_i\ \forall \lambda_i$. We also arrange the subsets
$\mathcal{D}_i$ ($i = 0, \ldots, N$) in increasing order with respect
to their communication cost: $c_i < c_{i+1}$. Local deterministic
strategies thus belong to $\mathcal{D}_0$, for which $c_0 = 0$,
while the maximum communication cost $c_N$ is associated
with strategies in $\mathcal{D}_N$. This occurs when both parties
need to send the value of their input to the other, so
$c_N = \log_2 M_A + \log_2 M_B$. We will further illustrate this
grouping of deterministic strategies in section IV.

With the above notation, a decomposition of p in terms
of deterministic strategies can be written

$$p = \sum_i \sum_{\lambda_i} q_{\lambda_i} d^{\lambda_i}. \tag{2}$$

It then directly follows that the average communication
$C(p, \{q_\lambda\})$ associated with the protocol is given by

$$C(p, \{q_\lambda\}) = \sum_i \sum_{\lambda_i} q_{\lambda_i} C(d^{\lambda_i}) = \sum_i \sum_{\lambda_i} q_{\lambda_i} c_i = \sum_i q_i c_i, \tag{3}$$

where $q_i = \sum_{\lambda_i} q_{\lambda_i}$ is the probability to use a strategy
from $\mathcal{D}_i$. The minimum amount of communication C(p)
necessary to reproduce the correlations p is the minimum
of $C(p, \{q_\lambda\})$ over all possible decompositions of the form
(2). If there exists a decomposition such that $q_0 = 1$, i.e.,
if the correlations can be written as a convex combination
of local deterministic strategies, then C(p) = 0 and the
correlations are local. If for every decomposition $q_0 < 1$,
the correlations are non-local and they violate a Bell
inequality.



B. Bell inequalities 

A Bell inequality can be viewed as a vector b which
associates to each probability distribution p a number
$B(p) = b \cdot p$. One particular number is the local bound
$B_0 = \max_{\lambda_0}\{b \cdot d^{\lambda_0}\}$. By convexity, every local probability
distribution $l = \sum_{\lambda_0} q_{\lambda_0} d^{\lambda_0}$ satisfies the inequality
$B(l) \le B_0$. Correlations p that violate it, $B(p) > B_0$,
are therefore non-local. To extract more information
from B(p) than a simple detection of non-locality it is
necessary to consider not only the upper bound $B_0$ the
inequality takes on the local subset $\mathcal{D}_0$, but also on all
the other subsets $\mathcal{D}_i$:

$$B_i = \max_{\lambda_i}\{b \cdot d^{\lambda_i}\}. \tag{4}$$

Given this extra knowledge, a constraint on the decomposition
(2) can be deduced from the amount by which
p violates the Bell inequality. This turns into a bound
on C(p) which is the basis of the present paper.



C. Main results 

Proposition 1. For every inequality b and probability
distribution p, the following bound holds:

$$C(p) \ge \frac{B(p) - B_0}{B_{j^*} - B_0}\, c_{j^*}, \tag{5}$$

where $j^*$ is the index such that $(B_{j^*} - B_0)/c_{j^*} = \max_{j \ne 0}\{(B_j - B_0)/c_j\}$.

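Proof. From (2) and (4), we deduce $B(p) = b \cdot p = \sum_i \sum_{\lambda_i} q_{\lambda_i}\, b \cdot d^{\lambda_i} \le \sum_i q_i B_i$.
Since $\sum_i q_i = 1$ we find

$$B(p) - B_0 \le \sum_{i \ne 0} q_i (B_i - B_0) \tag{6}$$

or

$$\frac{B(p) - B_0}{B_{j^*} - B_0} \le \sum_{i \ne 0} q_i\, \frac{B_i - B_0}{B_{j^*} - B_0}. \tag{7}$$

We thus obtain

$$C(p) = \sum_i q_i c_i \ge \sum_{i \ne 0} q_i\, \frac{B_i - B_0}{B_{j^*} - B_0}\, c_{j^*} \ge \frac{B(p) - B_0}{B_{j^*} - B_0}\, c_{j^*}, \tag{8}$$

where the first inequality uses $(B_{j^*} - B_0)/c_{j^*} \ge (B_i - B_0)/c_i$,
which follows from the definition of $j^*$, and the second uses (7). □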
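
As an illustration of Proposition 1, the bound (5) can be evaluated with a few lines of code once the values $B_i$ and costs $c_i$ of each class of deterministic strategies are known. The following is only a minimal sketch: the function name `communication_lower_bound` and its argument layout are our own choices for illustration, and the numbers used in the example are the CHSH values derived in section III.

```python
import math

def communication_lower_bound(B_p, B, c):
    """Right-hand side of bound (5) (illustrative helper, not from the paper).

    B_p : value B(p) the inequality takes on the correlations p
    B   : list [B_0, B_1, ..., B_N], maximum of b.d over each class D_i
    c   : list [c_0, c_1, ..., c_N], communication cost of each class (c_0 = 0)
    """
    B0 = B[0]
    # j* maximises the violation gained per bit of communication.
    j_star = max(range(1, len(B)), key=lambda j: (B[j] - B0) / c[j])
    return (B_p - B0) / (B[j_star] - B0) * c[j_star]

# CHSH values from section III: B_0 = 2, B_1 = 4, c_1 = 1,
# evaluated at the maximal quantum violation B(p) = 2*sqrt(2).
bound = communication_lower_bound(2 * math.sqrt(2), [2, 4], [0, 1])
print(round(bound, 4))  # 0.4142
```

Only the class with the best ratio of violation to communication enters the result, which is why a single pair $(B_{j^*}, c_{j^*})$ appears in (5).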

The bound (5) that the inequality b imposes on the average
communication C(p) is proportional to the degree
of violation $B(p) - B_0$ times a normalisation factor $c_{j^*}/(B_{j^*} - B_0)$
expressed in units of "communication per amount of violation".
This naturally suggests rewriting Bell inequalities
in natural units where $c_{j^*}/(B_{j^*} - B_0) = 1$ so that (5) takes
a simpler form:



Proposition 2. Every Bell inequality b can be rewritten
in a normalised form b' such that $B'_i \le c_i\ \forall i$. For the
normalised inequality the bound (5) becomes

$$C(p) \ge B'(p). \tag{9}$$

Proof. Define the normalised version of the inequality b
as

$$b' = \frac{c_{j^*}}{B_{j^*} - B_0} \left( b - \frac{B_0}{M_A M_B}\, \mathbf{1} \right), \tag{10}$$

where $j^*$ is taken as in Proposition 1 and $\mathbf{1}$ is the vector with
all entries equal to one. Note that $\mathbf{1} \cdot p = M_A M_B$
since the entries of all correlation vectors p
satisfy the normalisation constraints (1). The effect of
the term $-\frac{B_0}{M_A M_B} \mathbf{1}$ in (10) is thus to shift the value the
inequality takes on an arbitrary vector p from B(p) to
$B(p) - B_0$. We therefore get $B'_i = \max_{\lambda_i}\{b' \cdot d^{\lambda_i}\} = \frac{c_{j^*}}{B_{j^*} - B_0}(B_i - B_0) \le c_i$,
where the last inequality holds by definition of $j^*$.

We then immediately deduce (9) since $B'(p) = b' \cdot p = \sum_i \sum_{\lambda_i} q_{\lambda_i}\, b' \cdot d^{\lambda_i} \le \sum_i q_i B'_i \le \sum_i q_i c_i = C(p)$. □
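
To make the normalisation concrete, it may help to instantiate (10) for the CHSH inequality, using the values obtained in section III ($B_0 = 2$, $B_{j^*} = B_1 = 4$, $c_{j^*} = c_1 = 1$) together with $M_A = M_B = 2$. The short computation below is a worked example under those assumptions, not an additional result.

```latex
b' = \frac{1}{4 - 2}\left( b - \frac{2}{2 \cdot 2}\,\mathbf{1} \right)
   = \tfrac{1}{2}\, b - \tfrac{1}{4}\,\mathbf{1},
\qquad
B'(p) = \tfrac{1}{2} B(p) - \tfrac{1}{4}\,(\mathbf{1} \cdot p)
      = \tfrac{1}{2} B(p) - 1,
\\[4pt]
B'_0 = \tfrac{1}{2} \cdot 2 - 1 = 0 = c_0,
\qquad
B'_1 = \tfrac{1}{2} \cdot 4 - 1 = 1 = c_1 .
```

In this normalised form the value of the inequality is directly a number of bits, and the bound (9) reads $C(p) \ge \tfrac{1}{2} B(p) - 1$.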

Assuming Bell inequalities are written in this standard 
way where $B_i \le c_i$, it follows from (9) that for a given
probability distribution p, the inequality that leads to 
the strongest bound on C(p) is the one for which B(p) 
takes the greatest value. In fact we have: 

Proposition 3. Let b* be the normalised inequality that
gives the maximum violation $B^*(p) = \max_b\{B(p)\}$ for
the correlations p, then

$$C(p) = B^*(p). \tag{11}$$



Proof. This follows from the duality theorem of linear
programming. Indeed $B^*(p)$ is the solution to the
following linear programming problem:

$$\begin{aligned}
\max\ & b \cdot p \\
\text{subject to}\ & b \cdot d^{\lambda_i} \le c_i \quad \forall \lambda_0, \ldots, \lambda_i, \ldots, \lambda_N
\end{aligned} \tag{12}$$

for the variable b. The dual of that problem is

$$\begin{aligned}
\min\ & \sum_i \sum_{\lambda_i} c_i q_{\lambda_i} = \sum_i c_i q_i \\
\text{subject to}\ & \sum_i \sum_{\lambda_i} q_{\lambda_i} d^{\lambda_i} = p \\
& q_{\lambda_i} \ge 0 \quad \forall \lambda_0, \ldots, \lambda_i, \ldots, \lambda_N
\end{aligned} \tag{13}$$

for the variables $q_{\lambda_i}$. The solution to the dual problem is
C(p) since it just amounts to searching for the optimal decomposition
$\{q_{\lambda_i}\}$ of p which leads to the lowest average
communication (note that the condition $\sum_i \sum_{\lambda_i} q_{\lambda_i} = 1$
is in fact already implied by the normalisation conditions
that $d^{\lambda_i}$ and p satisfy). Now, the duality theorem of linear
programming states that if the primal (dual) has an
optimal solution, then the dual (primal) problem also
has an optimal solution and moreover the two solutions
coincide, i.e. $B^*(p) = C(p)$. □

This last result introduces the concept of an optimal 
inequality b* from the communication point of view for 
the correlations p. Indeed the bounds (5) and (9) can be
interpreted as bounds on the communication necessary
to simulate classically a violation of the inequality b by
the amount B(p). Of course this is also a bound on the
average communication C(p) necessary to reproduce the
entire set of correlations p. In general however, more
communication may be necessary to carry out the latter
task than the former. For the optimal inequality b*,
though, the communication is identical in the two cases.
If we quantify non-locality by the amount of communication
needed to simulate it classically, a violation of the
inequality b* by the amount $B^*(p)$ therefore exhibits the
complete non-locality contained in the correlations p.



D. Comparing Bell inequalities 

The bound (5) simply expresses that the most efficient
strategy to simulate a violation of a Bell inequality uses
local deterministic protocols (which do not necessitate any
communication) and deterministic protocols from $\mathcal{D}_{j^*}$ for
which the ratio of violation per communication $(B_{j^*} - B_0)/c_{j^*}$
is maximal. Indeed, for that strategy a violation
of $B(p) = (1 - q_{j^*}) B_0 + q_{j^*} B_{j^*}$ implies

$$q_{j^*} = \frac{B(p) - B_0}{B_{j^*} - B_0} \tag{14}$$

and thus a communication $C = q_{j^*} c_{j^*} = \frac{B(p) - B_0}{B_{j^*} - B_0}\, c_{j^*}$,
which is nothing more than the right-hand side of (5).

The bound (5) can thus be viewed as the minimal
communication needed to produce a given violation of
the inequality b. This allows us to compare the amount
of violation of different Bell inequalities, possibly corresponding
to different Bell scenarios. If the inequalities are
normalised so that $B_i \le c_i$, the bound (9) holds and the
comparison is even more direct: the greater the violation,
the greater the non-locality exhibited by the inequality.

This way of weighing Bell inequalities is correct however
only if $B(p) \le B_{j^*}$. Indeed if this is not the case,
the strategy just described no longer works since in (14)
$q_{j^*} > 1$. Though the bounds (5) and (9) are still valid,
it is then in principle possible to infer stronger bounds
from the violation of the Bell inequality. This should be
taken into account when comparing Bell inequalities in
this way.



In the remainder of the paper, we will only be concerned
with two-settings Bell scenarios. Note that in
that case, $B(p) \le B_{j^*}$ is always satisfied for quantum
correlations. Indeed the minimal possible communication
in a (non-local) deterministic protocol is 1 bit and
is associated with strategies in $\mathcal{D}_1$. However every quantum
correlation of a two-settings Bell scenario can be
reproduced with 1 bit of communication (indeed it suffices
for one of the parties to send her input to the other
so that they are able to simulate classically the quantum
correlations). It therefore follows that $B(p) \le B_1 \le B_{j^*}$.



E. Other measures of communication 

The general arguments we presented in this section
remain valid independently of the precise way communication
is counted and the way deterministic strategies
are accordingly partitioned. Depending on the physical
quantity one is interested in, different measures for the
communication cost $c_i$ are thus possible. For example,
to obtain bounds on the average communication needed
to reproduce quantum correlations in classical protocols
that use only 1-way communication, the cost of deterministic
strategies using 2-way communication would be
taken to be $c = \infty$. Our results therefore apply to all
average-type measures of communication.

Note that one can also count the communication using
Shannon's entropy if it is assumed that the parties may
perform block coding. This is natural for instance if the
parties perform several runs of the protocol at once, as
in the definition of the asymptotic communication $C_\infty$.
The resulting bound however will not be a lower bound
on the asymptotic communication $C_\infty$. This is because
for Bell scenarios corresponding to n runs in parallel,
there are deterministic strategies that cannot be written
as the product of n one-run deterministic strategies. As n
increases, there thus exist new ways of decomposing the
correlations in terms of deterministic protocols that can
possibly result in lower communication per run but which
are not taken into account in the one-run decomposition
(2).

Finally, note that computing the communication costs
associated with deterministic strategies is in general a difficult
task. It is a particular problem of the field of communication
complexity, for which several techniques have
been specially developed. However in the case of the
CHSH and the CGLMP inequalities, the bound (5) can
easily be deduced.



III. CHSH INEQUALITY 

Let us now focus on the simplest inequality, the CHSH
inequality [14]. The CHSH inequality refers to two-settings
and two-outcomes Bell scenarios. The value the
inequality takes on an arbitrary vector p is

$$\begin{aligned}
B(p) = {} & p(a_0 = b_0) + p(b_0 \ne a_1) + p(a_1 = b_1) + p(b_1 = a_0) \\
& - \left[ p(a_0 \ne b_0) + p(b_0 = a_1) + p(a_1 \ne b_1) + p(b_1 \ne a_0) \right]
\end{aligned} \tag{15}$$

where $p(a_x = b_y) = p_{00|xy} + p_{11|xy}$ and $p(a_x \ne b_y) = p_{10|xy} + p_{01|xy}$.
The local bound of this inequality is $B_0 = 2$. The maximal violation of the CHSH inequality by
quantum mechanics is $2\sqrt{2}$ and is obtained by performing
measurements on Bell states. On the other hand, the
maximum value it can take for all possible distributions
is 4 when the four terms with a plus sign are equal to
one.

To derive a bound on C(p) from (15), we need to
compute $\max_{j \ne 0}\{(B_j - B_0)/c_j\}$. Note that in a deterministic
protocol, either the two parties don't communicate
at all, or one of the parties starts speaking to the
other. In the latter case, the minimum communication
she can send is 1 bit. This implies that the minimum
possible average communication for non-local deterministic
strategies is $c_1 = 1$. The following protocol with
entries $d_{ab|xy} = \delta_{a,\alpha(x,y)}\,\delta_{b,\beta(x,y)}$, where

$$\begin{aligned}
& \alpha(x,y) = 0 \quad \text{for } x, y = 0, 1, \\
& \beta(0,0) = 0, \quad \beta(1,0) = 1, \quad \beta(0,1) = \beta(1,1) = 0,
\end{aligned} \tag{16}$$

can be implemented with 1 bit of communication. Indeed
it suffices for Alice to send the value of her input to Bob.
Moreover, the value B(d) it takes on the inequality (15)
is the maximum possible, B(d) = 4. It thus follows that
$\max_{j \ne 0}\{(B_j - B_0)/c_j\} = (4 - 2)/1 = 2$, so that for the
CHSH inequality the bound (5) becomes

$$C(p) \ge \tfrac{1}{2} B(p) - 1. \tag{17}$$

This implies for instance that to reproduce the optimal
quantum correlations at least $\sqrt{2} - 1 \approx 0.4142$ bits of
communication are necessary. Note that to reproduce all
possible von Neumann measurements on a Bell state 1
bit is sufficient.
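
The numbers entering (17) can be checked by brute force, since a two-settings, two-outcomes scenario has only 256 deterministic strategies. The sketch below is illustrative only: it assumes the sign convention of (15) in which the pair x = 1, y = 0 is the one rewarding different outcomes, and all helper names (`chsh_value`, `all_maps`, `is_local`, `bound_17`) are ours.

```python
from itertools import product
import math

INPUTS = list(product((0, 1), repeat=2))  # input pairs (x, y)

# Relation rewarded by (15) for each input pair: 0 means "a = b", 1 means "a != b".
# Only the pair x = 1, y = 0 rewards different outcomes.
REWARDED_XOR = {(0, 0): 0, (1, 0): 1, (1, 1): 0, (0, 1): 0}

def chsh_value(alpha, beta):
    """B(d) for the deterministic strategy a = alpha[x, y], b = beta[x, y]."""
    return sum(
        1 if (alpha[xy] ^ beta[xy]) == REWARDED_XOR[xy] else -1
        for xy in INPUTS
    )

def all_maps():
    """All 16 functions from an input pair (x, y) to an outcome in {0, 1}."""
    for values in product((0, 1), repeat=4):
        yield dict(zip(INPUTS, values))

def is_local(alpha, beta):
    """True if alpha ignores y and beta ignores x."""
    return (all(alpha[x, 0] == alpha[x, 1] for x in (0, 1))
            and all(beta[0, y] == beta[1, y] for y in (0, 1)))

strategies = [(al, be) for al in all_maps() for be in all_maps()]
B0 = max(chsh_value(al, be) for al, be in strategies if is_local(al, be))
B_max = max(chsh_value(al, be) for al, be in strategies)

# Strategy (16): Alice always outputs 0; Bob outputs 1 only for (x, y) = (1, 0).
alpha16 = {xy: 0 for xy in INPUTS}
beta16 = {(0, 0): 0, (1, 0): 1, (0, 1): 0, (1, 1): 0}

def bound_17(B_p):
    """Right-hand side of (17)."""
    return B_p / 2 - 1

print(B0, B_max, chsh_value(alpha16, beta16))   # 2 4 4
print(round(bound_17(2 * math.sqrt(2)), 4))     # 0.4142
```

In strategy (16) Alice's output ignores Bob's input, so a single bit from Alice to Bob is enough to implement it, in line with $c_1 = 1$.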

Is it possible to find a protocol that reproduces these
correlations with that amount $C(p) = \sqrt{2} - 1$ of communication?
It turns out in fact that the CHSH inequality
is optimal, i.e., the bound (17) is saturated, for all quantum
correlations. Indeed, quantum correlations satisfy
the no-signalling conditions:

$$\begin{aligned}
\sum_b p_{ab|xy} &= \sum_b p_{ab|xy'} \quad \forall y, y' \\
\sum_a p_{ab|xy} &= \sum_a p_{ab|x'y} \quad \forall x, x'
\end{aligned} \tag{18}$$

which express that Alice's marginal probabilities are independent
of Bob's input and conversely. For correlations
that obey these constraints, we have:



Proposition 4. $C(p) = \frac{1}{2} B(p) - 1$ bits of communication
are necessary and sufficient to simulate probabilities
p that violate the CHSH inequality (15) and satisfy the
no-signalling condition (18).

Proof. As the "necessary" part follows from the bound
(17), we just have to exhibit a classical protocol that
reproduces the correlations with that amount of communication.

First note that when the bound (5) is saturated, it
follows from the proof of Proposition 1 that the optimal
protocol uses only strategies from $\mathcal{D}_0$ and $\mathcal{D}_{j^*}$ and
moreover in these subsets only strategies that attain the
maximal values $B_0$ and $B_{j^*}$ on the inequality b (there
could be more than one subset $\mathcal{D}_{j^*}$ if there are several
indices $j^*$ for which $(B_{j^*} - B_0)/c_{j^*}$ is maximum). In
our case, this implies that the optimal protocol must be
built from local strategies $d^{\lambda_0}$ and from 1-bit strategies
$d^{\lambda_1}$ such that $b \cdot d^{\lambda_0} = B_0 = 2$ and $b \cdot d^{\lambda_1} = B_1 = 4$.

The entries of the vectors p corresponding to the Bell
scenario associated with the CHSH inequality consist of
16 probabilities $p_{ab|xy}$ since a, b, x and y each take two
possible values. Half of these probabilities appear with a
plus sign in the CHSH expression (15) and half of them
with a minus sign. Since entries $d_{ab|xy} = \delta_{a,\alpha(x,y)}\,\delta_{b,\beta(x,y)}$
of deterministic strategies are either equal to 0 or 1, for
a deterministic strategy d to satisfy B(d) = 2, it must
contribute to (15) with one $-$ and three $+$. For local
strategies, which assign local values $\alpha(x)$ and $\beta(y)$ to Alice's
and Bob's outcomes, this leaves eight possibilities.
Indeed if we choose one of the eight entries appearing
in (15) with a $-$ sign to be equal to one, the requirement
that three entries appearing with a $+$ sign must
also be equal to one fully determines the functions $\alpha(x)$
and $\beta(y)$. The resulting eight possible local strategies
$d^{\lambda_0}$ ($\lambda_0 = 0, \ldots, 7$) are given in Table I. On the other
hand, for a deterministic strategy to attain B(d) = 4, it
must contribute to (15) with four terms weighted by a
$+$. The assignments of outcomes of 1-bit strategies $d^{\lambda_1}$
are either of the form $\alpha(x)$, $\beta(x,y)$ (when Alice sends
her input to Bob), or $\alpha(x,y)$, $\beta(y)$ (when it is Bob who
sends his input to Alice). For each of the four possible
functions $\alpha(x)$, the requirement that all the entries of the
deterministic vector equal to one appear with a $+$ in the
CHSH inequality fixes the function $\beta(x,y)$, and similarly
for the four possible functions $\beta(y)$. There are thus eight
protocols in $\mathcal{D}_1$ that attain the bound $B_1 = 4$. These
strategies are given in Table II.

Having characterized the deterministic strategies from
which the protocol is built, it remains to determine the
probabilities $q_\lambda$ with which these strategies are used.
These must be chosen so that

$$p_{ab|xy} = \sum_{\lambda_0 = 0}^{7} q_{\lambda_0} d^{\lambda_0}_{ab|xy} + \sum_{\lambda_1 = 0}^{7} q_{\lambda_1} d^{\lambda_1}_{ab|xy} \tag{19}$$

holds for the 16 entries $p_{ab|xy}$. Let's focus first on the
entries that enter in (15) with a $-$ sign. For each of
these eight entries, the only contribution to the right-hand
side of (19) different from zero comes from a local
deterministic strategy $d^{\lambda_0}$. This therefore fixes the value
of the corresponding probability $q_{\lambda_0}$. For instance
$q_{\lambda_0 = 0} = p_{00|10}$ or $q_{\lambda_0 = 1} = p_{10|11}$.

TABLE I: The eight local deterministic strategies for which
B(d) = 2

We now have to determine the value of the probabilities
$q_{\lambda_1}$ so that the eight entries $p_{ab|xy}$ that enter (15) with
a $+$ sign satisfy (19). For simplicity let's focus on one
of these entries: $p_{00|00}$. Using Tables I and II, equation
(19) becomes

$$p_{00|00} = q_{0_0} + q_{1_0} + q_{7_0} + q_{0_1} + q_{2_1} + q_{4_1} + q_{6_1} \tag{20}$$

(where $q_{k_0}$ and $q_{k_1}$ denote the weights of the strategies
$\lambda_0 = k$ and $\lambda_1 = k$), or

$$q_{0_1} + q_{2_1} + q_{4_1} + q_{6_1} = p_{00|00} - p_{00|10} - p_{10|11} - p_{01|01}, \tag{21}$$

where we replaced each of the probabilities $q_{\lambda_0}$ with their
value previously determined. From (15) and using the
no-signalling conditions (18) and the normalisation conditions
(1), it is not difficult to see that the right-hand
side of this equation is equal to $(B(p) - 2)/4$. The same
argument can be carried out for all the seven other entries
that contribute to the CHSH inequality with a $+$ sign,
each time finding that the sum of four probabilities $q_{\lambda_1}$
equals $(B(p) - 2)/4$. Taking $q_{\lambda_1} = (B(p) - 2)/16$ for
$\lambda_1 = 0, \ldots, 7$ one therefore obtains a solution to (19).

The communication associated with this protocol is thus
$C = \sum_\lambda q_\lambda C(d^\lambda) = \sum_{\lambda_1 = 0}^{7} q_{\lambda_1} = \frac{1}{2} B(p) - 1$. □
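
The construction used in this proof can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes as p the correlations with unbiased marginals and correlators $\pm 1/\sqrt{2}$, which reach $B(p) = 2\sqrt{2}$ with the sign convention of (15); since the contents of Tables I and II are not reproduced here, the sixteen strategies are found by enumeration rather than read from the tables; all function and variable names are ours.

```python
from itertools import product
import math

INPUTS = list(product((0, 1), repeat=2))            # input pairs (x, y)
T = {(0, 0): 0, (1, 0): 1, (1, 1): 0, (0, 1): 0}    # value of a XOR b rewarded by (15)

def rewarded(a, b, x, y):
    """True if the entry p_{ab|xy} enters (15) with a + sign."""
    return (a ^ b) == T[x, y]

# Correlations with unbiased marginals and correlators +-1/sqrt(2).
p = {(a, b, x, y): (1 + (1 if rewarded(a, b, x, y) else -1) / math.sqrt(2)) / 4
     for a, b in product((0, 1), repeat=2) for x, y in INPUTS}
B_p = sum(v if rewarded(*k) else -v for k, v in p.items())

def maps():
    """All functions from an input pair (x, y) to an outcome in {0, 1}."""
    for vals in product((0, 1), repeat=4):
        yield dict(zip(INPUTS, vals))

def value(al, be):
    """B(d) of the deterministic strategy (al, be)."""
    return sum(1 if rewarded(al[xy], be[xy], *xy) else -1 for xy in INPUTS)

def only_x(f):   # Alice's function ignores Bob's input y
    return all(f[x, 0] == f[x, 1] for x in (0, 1))

def only_y(f):   # Bob's function ignores Alice's input x
    return all(f[0, y] == f[1, y] for y in (0, 1))

pairs = [(al, be) for al in maps() for be in maps()]
local = [(al, be) for al, be in pairs
         if only_x(al) and only_y(be) and value(al, be) == 2]
one_bit = [(al, be) for al, be in pairs
           if (only_x(al) or only_y(be)) and value(al, be) == 4]

weights = []
for al, be in local:
    # The single entry of this local strategy that enters (15) with a - sign
    # fixes its weight, as in the proof.
    (x, y), = [xy for xy in INPUTS if not rewarded(al[xy], be[xy], *xy)]
    weights.append((p[al[x, y], be[x, y], x, y], al, be))
weights += [((B_p - 2) / 16, al, be) for al, be in one_bit]

# Rebuild p from the decomposition (19).
rebuilt = dict.fromkeys(p, 0.0)
for q, al, be in weights:
    for x, y in INPUTS:
        rebuilt[al[x, y], be[x, y], x, y] += q

# Local strategies cost nothing; each 1-bit strategy costs one bit.
cost = sum(q for q, al, be in weights[len(local):])
print(len(local), len(one_bit), round(cost, 4))   # 8 8 0.4142
```

The rebuilt vector agrees with p on all 16 entries, and the average communication of the mixture equals $\frac{1}{2} B(p) - 1 = \sqrt{2} - 1$, the value at which the bound (17) is saturated.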



TABLE II: The eight 1-bit deterministic strategies for which
B(d) = 4

IV. MORE DIMENSIONS: THE CGLMP INEQUALITY

The CGLMP inequality [15] generalises the CHSH inequality
for d-dimensional systems. This inequality refers
to measurement scenarios where Alice's and Bob's local
settings take two values x, y = 0, 1 and each measurement
gives d possible outcomes a, b = 0, ..., d - 1. The
value the CGLMP inequality takes on an arbitrary vector
p is

$$\begin{aligned}
B_d(p) = \sum_{k=0}^{[d/2]-1} \left( 1 - \frac{2k}{d-1} \right) \Big( & P(a_0 = b_0 + k) + P(b_0 = a_1 + k + 1) \\
& + P(a_1 = b_1 + k) + P(b_1 = a_0 + k) \\
& - \big[ P(a_0 = b_0 - k - 1) + P(b_0 = a_1 - k) \\
& + P(a_1 = b_1 - k - 1) + P(b_1 = a_0 - k - 1) \big] \Big)
\end{aligned} \tag{22}$$

where $P(a_x = b_y + k) = \sum_{b=0}^{d-1} p_{(b+k)b|xy}$ is the probability
that Alice's and Bob's results satisfy a = b + k mod d
when measuring x and y. As shown in [15], the local
bound of the inequality is $B_0 = 2$.

When d = 2 we recover the CHSH inequality and in
that case the maximal quantum violation is $B_2^{ME} \approx 2.828$.
For d > 2, the (conjectured) maximal violations
obtained from maximally entangled qudits are given
in [15]. For qutrits the maximum is $B_3^{ME} \approx 2.8729$
and this value increases with d. This suggests that the
CGLMP inequality exhibits stronger non-local correlations
for larger d. This has been made more precise by
connecting the violation of the CGLMP inequality to the
resistance of the correlations to the admixture of noise
[15]. It has however been argued that the resistance
to noise is not a good measure of non-locality.
Quite surprisingly it was also found in [16] that for d > 2
the strongest violation of the CGLMP inequality is obtained
using certain non-maximally entangled states. For
qutrits, for instance, the maximal violation obtained from
a non-maximally entangled state is $B_3^{NME} \approx 2.9149$,
which is higher than the maximum $B_3^{ME} \approx 2.8729$ for
the maximally entangled one. Moreover, this discrepancy
between maximally and non-maximally entangled states grows
with the dimension. This raises several questions on how
one should interpret and compare these manifestations
of non-locality.

A natural answer is through the bound (5). The derivation
of the bound for the CHSH inequality in the previous
section can directly be applied to the CGLMP inequality.
This yields

$$C_d(p) \ge \tfrac{1}{2} B_d(p) - 1. \tag{23}$$

This bound is the same for all the inequalities of the family
(22), and the strength of these different inequalities
can therefore simply be measured by the degree by which
they are violated. This confirms the intuition that the
non-locality displayed by the CGLMP inequality grows
with the dimension.

On the other hand, the fact that for d > 2 the CGLMP 
inequality is maximally violated for non-maximally en- 
tangled states translates into more severe constraints on 
the average communication necessary to reproduce cor- 
relations obtained by measuring certain non-maximally 
entangled states than maximally entangled ones. For instance,
for qutrits (23) implies that $C_{ME} \ge 0.4365$ while
$C_{NME} \ge 0.4575$. It could however be that for maximally
entangled states the CGLMP inequality is not optimal 
and that another inequality will impose stronger bounds 
for states that are maximally entangled. 
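
The figures quoted above follow from inserting the stated maximal violations into (23). The snippet below is a small arithmetic sketch; the dictionary keys and the helper name `cglmp_bound` are ours, and the violation values are simply those quoted in the text.

```python
import math

# Maximal violations quoted in the text (the qubit value is 2*sqrt(2)).
violations = {
    "d=2, maximally entangled": 2 * math.sqrt(2),
    "d=3, maximally entangled": 2.8729,
    "d=3, non-maximally entangled": 2.9149,
}

def cglmp_bound(B_d):
    """Right-hand side of (23): lower bound on the average communication, in bits."""
    return B_d / 2 - 1

bounds = {name: cglmp_bound(B) for name, B in violations.items()}
for name, value in bounds.items():
    print(f"{name}: C >= {value:.4f}")
```

Because (23) has the same form for every d, ordering the states by their violation is the same as ordering them by the communication bound.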

To verify that assertion, we numerically solved the linear
programming problem (13) for the correlations that
maximally violate the CGLMP inequality both on maximally
and non-maximally entangled states for d < 8.
There exist many different algorithms for linear programming
and the only difficulty in solving (13) is to characterize
the sets $\mathcal{D}_i$ of deterministic strategies and their
corresponding communication costs $c_i$. A deterministic
strategy assigns a definite value $\alpha(x,y)$ to Alice's outcomes
and $\beta(x,y)$ to Bob's outcomes for each of the four
possible pairs of inputs (x, y). To simplify the notation
we write $\alpha_x(y) = \alpha(x,y)$ and $\beta_y(x) = \beta(x,y)$. There are
two possibilities for $\alpha_x$: either $\alpha_x$ is constant (cst), i.e.,
$\alpha_x(0) = \alpha_x(1)$, and given input x Alice does not need
any information from Bob to determine her output; or
$\alpha_x \ne$ cst, that is $\alpha_x(0) \ne \alpha_x(1)$, and Alice's outcome
depends not only on her local setting x but also on Bob's
one. In that case Alice needs one bit of information from
Bob to output her result. The situation is similar for
Bob. This leads to four possible sets of deterministic
strategies:

i) 2?o : the set of local deterministic strategies for which 
a x = est and f3 y — est for x = 0, 1 and y = 0, 1. These 
don't need any communication to be implemented: cq — 
0. 

ii) V\\ the strategies where a x = est for x = 0, 1 and 
at least one of the f3 y ^ est. These strategies necessitate 



1 bit of communication from Alice to Bob. This set also 
contains the reverse strategies which need 1 bit of com- 
munication from Bob to Alice. The communication cost 
associated to T>\ is therefore c\ = l. 

iii) $\mathcal{D}_2$: the protocols where $\alpha_x$ = cst for one of the
two values x = 0 or x = 1, $\alpha_x \neq$ cst for the other value of x,
and at least one of the $\beta_y \neq$ cst. These strategies can be
implemented by Alice sending one bit to Bob, the value
of her input, and then Bob sending back to Alice the
value of his input if Alice's input is the value of x for
which $\alpha_x \neq$ cst. The average communication exchanged is
3/2 bits so that $c_2 = 3/2$.
This set also contains the strategies where Alice's and
Bob's positions are permuted.

iv) $\mathcal{D}_3$: $\alpha_x \neq$ cst and $\beta_y \neq$ cst for x = 0, 1 and y = 0, 1.
To implement these strategies both parties need to know
the input of the other, so $c_3 = 2$.
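The four classes above can be checked by brute-force enumeration. The following Python sketch is illustrative only: the function names and the encoding of a strategy as nested tuples are assumptions introduced here, not notation from the text. It counts how many deterministic strategies fall into each set for d outcomes and records the costs $c_0 = 0$, $c_1 = 1$, $c_2 = 3/2$, $c_3 = 2$.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Communication cost c_i attached to each class D_i, as listed in the text.
COSTS = {0: Fraction(0), 1: Fraction(1), 2: Fraction(3, 2), 3: Fraction(2)}


def count_nonconstant(table):
    """table[s][t] is a party's output for its own setting s and the other
    party's setting t. Returns how many own settings need the other's input."""
    return sum(1 for s in (0, 1) if table[s][0] != table[s][1])


def strategy_class(alpha, beta):
    """Index i of the set D_i containing the strategy (alpha, beta)."""
    n_a = count_nonconstant(alpha)
    n_b = count_nonconstant(beta)
    if n_a == 0 and n_b == 0:
        return 0          # local strategy, no communication
    if n_a == 0 or n_b == 0:
        return 1          # one-way communication of one bit
    if n_a == 2 and n_b == 2:
        return 3          # both parties always need the other's input
    return 2              # remaining cases: 3/2 bits on average


def enumerate_classes(d):
    """Count the deterministic strategies with d outcomes in each class."""
    tables = [((a, b), (c, e)) for a, b, c, e in product(range(d), repeat=4)]
    return Counter(strategy_class(al, be) for al, be in product(tables, tables))


if __name__ == "__main__":
    for index, size in sorted(enumerate_classes(2).items()):
        print(f"D_{index}: {size} strategies, cost {COSTS[index]} bits")
```

For d = 2 the enumeration gives 16, 96, 128 and 16 strategies in the four sets, which together account for all 2^8 = 256 deterministic strategies.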

With this assignment of communication costs to deterministic
strategies and for the correlations considered
(d < 8), it turns out from the results of the numerical
optimisation (13) that the CGLMP inequality is optimal,
i.e. the bound (23) is saturated. For these particular
measurements, those that give rise to the maximal
violation of the CGLMP inequality, more communication
is thus necessary to reproduce results obtained on
non-maximally entangled states than on maximally entangled
ones.

It is nevertheless possible that these measurements are
not optimal to detect the non-locality of maximally entangled
states. We performed numerical searches for
d = 3, optimising the two von Neumann measurements
the parties may choose. We found that the measurements
that necessitate the maximal communication to
be simulated are the ones that maximise the violation of
the CGLMP inequality. These results therefore suggest that
two measurement settings on each side do not optimally detect
non-locality for $d \geq 3$. It is still possible that the simulation
of POVMs on maximally entangled states would
necessitate further communication. However, concurring
with [16], we believe that more settings per site and a
corresponding new Bell inequality are needed [26].

V. OPTIMAL INEQUALITIES AND FACET 
INEQUALITIES 

The CHSH and the CGLMP inequalities are special inequalities:
they are facet inequalities. Local correlations
$l$ are convex combinations of a finite number of points,
the local deterministic strategies: $l = \sum_\lambda q_\lambda d_0^\lambda$. The
set of local correlations thus forms a convex polytope.
Every polytope can be characterized either by its vertices
(the local deterministic strategies) or by its facets,
which are a finite set of inequalities $b^i$ ($i = 1, \ldots, M$):

$$l = \sum_\lambda q_\lambda d_0^\lambda \iff b^i \cdot l \leq B_0^i, \quad i = 1, \ldots, M \qquad (24)$$

Facet inequalities thus form a minimal set of inequali- 
ties that fully characterize the local correlations. They 
can therefore be viewed as tight detectors of non-locality.
Complete sets of facet inequalities are known in some 
cases 123, El E3- For two settings, two outcomes 
Bell scenarios, the CHSH is the unique (up to symme- 
tries and besides trivial inequalities that are always sat- 
isfied by quantum correlations) facet inequality. It turns 
out that it is also optimal with regard to the average 
communication C for all quantum correlations. For Bell 
scenarios involving more outcomes, we have seen that the 
CGLMP inequality is optimal for certain correlations. 

Is it the case that for quantum correlations, optimal
inequalities are always facet inequalities? Consider
for instance the following correlations belonging to
a two settings, three outcomes Bell scenario: Alice and
Bob share the maximally entangled state of two qutrits
$|\psi\rangle = \frac{1}{\sqrt{3}}(|00\rangle + |11\rangle + |22\rangle)$. The measurements they
perform consist of each carrying out the transformation
$|i\rangle \to e^{i\phi(i)}|i\rangle$, followed by a Fourier Transform $U_{FT}$ for
Alice and $U_{FT}^*$ for Bob and then a measurement in the
computational basis. The settings of their measuring
apparatus are thus determined by the three phases they
use. For Alice's settings x = 0 and x = 1 the phases are
$(0, 0, 0)$ and $(0, 0, \pi/2)$, while for Bob's settings y = 0 and
y = 1 they are $(0, 0, \pi/4)$ and $(0, 0, -\pi/4)$. This results
in the probabilities

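$$
\begin{aligned}
p(a_x = b_y) &= \left(5 + (-1)^{f(x,y)}\, 2\sqrt{2}\right)/9 \\
p(a_x = b_y + 1) &= \left(2 - (-1)^{f(x,y)} \sqrt{2}\right)/9 \\
p(a_x = b_y + 2) &= \left(2 - (-1)^{f(x,y)} \sqrt{2}\right)/9
\end{aligned}
\qquad (25)
$$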

where f(x, y) = x(y + 1). 
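These probabilities can be recomputed directly from the state and the measurement settings. The sketch below is a hedged numerical check, not the authors' code. It assumes the convention $U_{FT}|k\rangle = \frac{1}{\sqrt{3}}\sum_a \omega^{ka}|a\rangle$ with $\omega = e^{2\pi i/3}$, and it assumes that Eq. (25) reads $p(a_x = b_y) = (5 + (-1)^{f(x,y)} 2\sqrt{2})/9$ and $p(a_x = b_y + 1) = p(a_x = b_y + 2) = (2 - (-1)^{f(x,y)}\sqrt{2})/9$. All function and variable names are introduced here for illustration.

```python
import cmath
import math

D = 3                                  # three outcomes per measurement
OMEGA = cmath.exp(2j * math.pi / D)    # primitive cube root of unity

# Phase settings quoted in the text (one phase per basis state).
ALICE_PHASES = {0: (0.0, 0.0, 0.0), 1: (0.0, 0.0, math.pi / 2)}
BOB_PHASES = {0: (0.0, 0.0, math.pi / 4), 1: (0.0, 0.0, -math.pi / 4)}


def joint_probability(a, b, x, y):
    """P(a, b | x, y) on the maximally entangled two-qutrit state, assuming
    Alice applies U_FT and Bob its complex conjugate after the phases."""
    amplitude = sum(
        cmath.exp(1j * (ALICE_PHASES[x][k] + BOB_PHASES[y][k]))
        * OMEGA ** (k * (a - b))
        for k in range(D)
    ) / (D * math.sqrt(D))
    return abs(amplitude) ** 2


def p_shift(k, x, y):
    """p(a_x = b_y + k mod 3), summed over Bob's outcome b."""
    return sum(joint_probability((b + k) % D, b, x, y) for b in range(D))


def closed_form(k, x, y):
    """Assumed right-hand sides of Eq. (25), with f(x, y) = x(y + 1)."""
    sign = (-1) ** (x * (y + 1))
    if k == 0:
        return (5 + sign * 2 * math.sqrt(2)) / 9
    return (2 - sign * math.sqrt(2)) / 9


if __name__ == "__main__":
    for x in (0, 1):
        for y in (0, 1):
            row = [round(p_shift(k, x, y), 4) for k in range(D)]
            print(f"x={x}, y={y}: {row}")
```

Under these assumptions the numerical and closed-form values agree for every pair of settings, and the three probabilities sum to one.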

These correlations violate the CGLMP inequality by
the amount $B_3(p) = \frac{2}{3}(1 + 2\sqrt{2}) \simeq 2.5523$. On the
other hand, consider the inequality (15), which has to be
viewed now as a three outcomes inequality, i.e.
$p(a_x = b_y) = \sum_k p_{kk|xy}$ and $p(a_x \neq b_y) = \sum_{k \neq l} p_{kl|xy}$, where
the sums over $k$ and $l$ run from 0 to 2. The above
correlations violate this straightforward generalisation of
the CHSH inequality to more outcomes by the amount
$B_{3c}(p) = \frac{2}{9}(1 + 8\sqrt{2}) \simeq 2.7364$. Since for both inequalities
$C(p) \geq \frac{1}{2}B(p) - 1$, the generalised CHSH inequality
is stronger than the CGLMP one for these particular
correlations. Moreover, numerically solving the linear
problem (13) we found $C(p) = 0.3682$, so that the bound
$C(p) \geq 0.3682$ implied by the generalised CHSH is saturated,
i.e. the inequality is optimal.
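The two violation values quoted above can be reproduced with a few lines of arithmetic. The sketch below rests on stated assumptions rather than on formulas given in this section. It takes Eq. (25) to read $p(a_x = b_y) = (5 + (-1)^{f(x,y)} 2\sqrt{2})/9$ and $p(a_x = b_y + k) = (2 - (-1)^{f(x,y)}\sqrt{2})/9$ for $k = 1, 2$. It takes inequality (15) to be the sum of the correlators $p(a_x = b_y) - p(a_x \neq b_y)$ weighted by $(-1)^{f(x,y)}$, and it uses the standard three-outcome CGLMP expression with settings relabelled to 0 and 1. These choices were made because they reproduce the quoted numbers. The function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2)


def f(x, y):
    return x * (y + 1)


def p_shift(k, x, y):
    """p(a_x = b_y + k mod 3) under the assumed reading of Eq. (25)."""
    sign = (-1) ** f(x, y)
    return (5 + 2 * sign * SQRT2) / 9 if k == 0 else (2 - sign * SQRT2) / 9


def generalised_chsh_value():
    """Sum over settings of (-1)^f(x,y) [p(a_x = b_y) - p(a_x != b_y)]."""
    total = 0.0
    for x in (0, 1):
        for y in (0, 1):
            p_equal = p_shift(0, x, y)
            total += (-1) ** f(x, y) * (p_equal - (1 - p_equal))
    return total


def cglmp_value():
    """Three-outcome CGLMP expression (assumed labelling of settings)."""
    plus = (p_shift(0, 0, 0) + p_shift(2, 1, 0)
            + p_shift(0, 1, 1) + p_shift(0, 0, 1))
    minus = (p_shift(2, 0, 0) + p_shift(0, 1, 0)
             + p_shift(2, 1, 1) + p_shift(1, 0, 1))
    return plus - minus


def communication_bound(value):
    """Lower bound C(p) >= B(p)/2 - 1 quoted in the text."""
    return value / 2 - 1


if __name__ == "__main__":
    for name, value in (("CGLMP", cglmp_value()),
                        ("generalised CHSH", generalised_chsh_value())):
        print(f"{name}: B = {value:.4f}, C >= {communication_bound(value):.4f}")
```

With these assumptions the generalised CHSH value rounds to 2.7364 and its bound to 0.3682, while the CGLMP value rounds to 2.5523.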

The generalised CHSH inequality, however, is not a
facet inequality. Indeed, for an inequality to be a facet,
the local deterministic strategies that attain the local
bound $B_0$ (i.e. the vertices that belong to the facet) must
generate a space of dimension one less than the dimension
of the polytope, since they form its boundary. It is shown
in [2^] that the two settings three outcomes polytope lies
in a hyperplane of dimension 24. For the inequality (15),
it is easily checked that there are only 21 local deterministic
strategies that attain the limit $B_0 = 2$. They thus
generate at best a space of dimension 21 which is less
than the expected value of 23 for (15) to be a facet.
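The count of saturating strategies can be checked by enumerating the 81 local deterministic strategies with three outcomes. This is a minimal sketch under one assumption: inequality (15) is taken to be the sum of the correlators $p(a_x = b_y) - p(a_x \neq b_y)$ weighted by $(-1)^{x(y+1)}$, with local bound 2. The helper names are hypothetical.

```python
from itertools import product


def chsh_value(a0, a1, b0, b1):
    """Value of the generalised CHSH expression on the local deterministic
    strategy where Alice outputs a_x and Bob outputs b_y."""
    alice, bob = (a0, a1), (b0, b1)
    total = 0
    for x in (0, 1):
        for y in (0, 1):
            correlator = 1 if alice[x] == bob[y] else -1
            total += (-1) ** (x * (y + 1)) * correlator
    return total


def saturating_strategies(d=3, local_bound=2):
    """Local deterministic strategies (a0, a1, b0, b1) reaching the bound."""
    return [s for s in product(range(d), repeat=4)
            if chsh_value(*s) == local_bound]


if __name__ == "__main__":
    best = max(chsh_value(*s) for s in product(range(3), repeat=4))
    print("local maximum:", best)
    print("saturating strategies:", len(saturating_strategies()))
```

Under that assumption the enumeration finds a local maximum of 2 reached by exactly 21 strategies, in agreement with the count stated above.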

Does there exist a facet inequality that imposes the 
same bound C(p) > 0.3682 as the generalised CHSH in- 
equality ? There exist algorithms that compute all the 
facets of a polytope given its vertices. Using both the 
reverse search vertex enumeration algorithm [24] and the
double description method (2j| we obtained the complete 
set of facet inequalities of the two settings, three out- 
comes local polytope which consists of 1116 inequalities. 
The correlations described above violate 23 of these in- 
equalities. 

Note that there are various ways of writing a Bell inequality
which are equivalent for local and quantum correlations.
Indeed local and quantum correlations satisfy
the normalisation and no-signalling conditions (18),
which we express as the constraints

$$g^j \cdot p = G_j, \quad j = 1, \ldots, J \qquad (26)$$

For probabilities that satisfy these conditions, the inequality
$b \cdot p \leq B$ can be rewritten in the equivalent
form

$$\Big(\mu_0 b + \sum_j \mu_j g^j\Big) \cdot p \leq \mu_0 B + \sum_j \mu_j G_j \qquad (27)$$

In particular, with that rewriting, a facet inequality will
remain a facet inequality and an inequality which is violated
by correlations satisfying (26) will still be violated.
This can be geometrically understood as follows. Probabilities
that satisfy the constraints (26) lie in a hyperplane
$\mathcal{G}$ of dimension less than the total dimension of the
space $V$ of all vectors $p$. An inequality $b \cdot p \leq B$ defines
a half-space in $V$. The fact that for probabilities in
$\mathcal{G}$, Bell inequalities can be written in different equivalent
ways corresponds to the fact that there are different half-spaces
of $V$ that have the same intersection with the hyperplane
$\mathcal{G}$. It is shown in [2^] that the dimension of the
two settings three outcomes polytope (the set of all local
correlations) is the same as that of the hyperplane $\mathcal{G}$ defined by
the conditions (26) of normalisation and no-signalling. It
therefore follows that the rewriting (27) based on these
constraints is the unique way to rewrite Bell inequalities
in an equivalent form for local correlations.

However, for probabilities which do not satisfy all of
these constraints, such as non-local deterministic strategies
$\mathcal{D}_{i \neq 0}$, the rewritten inequalities (27) are not equivalent
to the original one. They will thus lead to different
bounds on $C(p)$. The strongest bound on $C(p)$ a facet
inequality $b$ will impose on the correlations $p$ is the
solution to the following linear programming problem for
the variables $\mu_j$:

$$\max \; \mu_0 B(p) + \sum_j \mu_j G_j$$

$$\text{subject to} \quad \Big(\mu_0 b + \sum_j \mu_j g^j\Big) \cdot d_i^\lambda \leq c_i \qquad (28)$$

We numerically solved this linear problem for the correlations
described above and each of the 23 facet inequalities
they violate. The strongest bound obtained was given by 
the CGLMP inequality and is C(p) > 0.2764. 

This example shows that there exist quantum correlations
for which the strongest bound on C(p) deduced
from facet inequalities is lower than the (optimal) bound
given by a non-facet inequality. This is contrary to the
common view according to which facet inequalities are
"optimal" tests of non-locality [2^].

VI. CONCLUSION 

In summary, we have shown that the average commu- 
nication necessary to simulate classically a violation of 
a Bell inequality is proportional to the degree of violation
of the inequality. Moreover, to each set of correlations 
is associated an optimal inequality for which that com- 
munication is also sufficient to reproduce the entire set 
of correlations. The key ingredient was to compare the 
amount of violation of Bell inequalities not only with the 
maximum value they take on local deterministic strategies,
but also on non-local ones that necessitate some
communication to be implemented. 

Part of the interest of this work is that it gives a physi- 
cal meaning to the degree of violation of Bell inequalities 
and thus provides an objective way to compare violations
of different inequalities. It also gives a new way to
view and understand Bell inequalities that could shed
new light on some of their aspects. For instance, it was 
commonly assumed that facet inequalities are optimal 
tests of non-locality because they are tight "detectors" 
of non-locality. However if we measure non-locality by 
the communication needed to reproduce it, in certain 
situations non-facet inequalities are better "meters" of 
non-locality than are facet ones. 

This work also provides a tool to characterize and 
quantify the non-locality inherent in quantum correla- 
tions. As a result, for instance, for two measurements on 
each side it seems that the correlations that necessitate 
the most communication to be reproduced are obtained 
on non-maximally entangled states rather than on max- 
imally entangled ones for d > 2. It would be interesting 
to know whether this is still the case for more settings 
and if not, what is the corresponding Bell inequality. 